Mitochondrial genome sequence diversity 
of Indian Plasmodium falciparum isolates 
We have analysed the whole mitochondrial (mt) genome sequences (each ~6 kilobase pairs in length) 
of four field isolates of the malaria parasite Plasmodium falciparum collected from different locations in India. 
Comparative genomic analyses of mt genome sequences revealed three novel India-specific single nucleotide 
polymorphisms. In general, high mt genome diversity was found in Indian P. falciparum, at a level comparable to 
African isolates. A population phylogenetic tree placed the presently sequenced Indian P. falciparum with the global 
isolates, while a previously sequenced Indian isolate was an outlier. Although this preliminary study is limited to a 
small number of isolates, the data provide basic evidence of the mt genome diversity and evolutionary 
relationships of Indian P. falciparum with that of global isolates. 
Analyses of whole mitochondrial (mt) genome sequences 
have proven to be highly relevant in inferring 
both inter- and intra-specific evolutionary histories of 
many model and non-model organisms (Hedges 2000, 
Hurst & Jiggins 2005, Haag-Liautard et al. 2008) includ- 
ing the malaria parasites which have shown promising 
evolutionary outcomes (Joy et al. 2003, 2006, Mu et al. 
2005). For example, analyses of whole mt genomes in 
the two most common human malaria parasites, Plasmo- 
dium falciparum and Plasmodium vivax, have provided 
evidence of the origin and evolutionary histories of these 
species (Joy et al. 2003, 2006, Mu et al. 2005). In addition, 
useful inferences on the probable host-switching mecha- 
nism of P. falciparum, although much debated (Prugnolle 
et al. 2011), have been revealed to a certain extent. 

Malaria is highly endemic in India, with P. falciparum 
responsible for a heavy disease burden (Singh et al. 2009). However, 
the mt genome sequence of Indian P. falciparum has not 
yet been analysed and compared to global mt genome se- 
quences of this species. Therefore, the evolutionary his- 
tory of global P. falciparum isolates remains incomplete 
without Indian mt genome sequence data, although two 
mt genome sequences of Indian origin are available in the 
public domain (Sharma et al. 2001, Tyagi et al. 2014). In 
order to fill this gap, we have utilised whole mt genomes 
of four P. falciparum field isolates [3 new to this study 
and 1 previously reported (Tyagi et al. 2014)] from low, 
mild and high endemic localities of India and compared 
them with the global data. We have also included the 
already published mt genome sequence data of a single 
P. falciparum isolate of unknown Indian origin (Sharma et 
al. 2001) in this study to obtain first-hand information on 
the extant diversity of Indian P. falciparum in comparison 
to global isolates. 

We collected finger-prick blood samples from P. 
falciparum-infected malaria patients attending malaria 
clinics in three geographically distant localities of India 
with variable P. falciparum malaria epidemiology: Betul 
(state of Madhya Pradesh, high endemic), Goa (state of 
Goa, low endemic) and Mangalore (state of Karnataka, 
mild endemic). The samples were spotted (2-3 spots) on 
Whatman filter paper, dried and transported to the labo- 
ratory in New Delhi. While the sample from Betul was 
collected by us, the samples from Goa and Mangalore 
were obtained from the parasite bank of the National In- 
stitute of Malaria Research (NIMR). The fourth sample 
came from Bilaspur [state of Chhattisgarh, high endemic 
(Tyagi et al. 2014)] (Fig. 1). This study was approved by 
NIMR, New Delhi, India, and written informed consents 
were obtained from all patients who participated in the 
study before the samples were collected. The locations of 
the sample collection sites in India are shown in Fig. 1. 

For each collected sample, genomic DNA was iso- 
lated using the QIAamp mini DNA isolation kit (Qiagen, 
Germany) according to the manufacturer's instructions. 
Because both P. falciparum and P. vivax occur in India 
in almost equal proportion (Singh et al. 2009) and the 
mt genome is conserved among different species of hu- 
man malaria parasites (Hikosaka et al. 2011), we first 
tested each sample for incidences of mixed malaria para- 
site infections using nested polymerase chain reaction 
(PCR) detection assays with genus and species-specific 
primers based on the 18S rRNA gene (Gupta et al. 
2010). In order to PCR amplify and sequence the whole 
mt genome of 5,967 nucleotide base pairs, we used 19 
recently designed primer pairs [for details of the primer 
sequences and PCR conditions, see Tyagi et al. (2014)]. 
For each individual DNA fragment, the PCR protocols 
suggested by Tyagi et al. (2014) were followed. 

Fig. 1: map of India showing the collection locations of the 
Plasmodium falciparum isolates. 

To exclude infections with multiple clones of P. falciparum in a 
single patient, we sequenced all 19 DNA fragments of the 
mt genome from both directions (2X coverage, see be- 
low) and excluded isolates showing double peaks in any 
single nucleotide position (Gupta et al. 2012). For each 
single-clone P. falciparum isolate, the 19 PCR-amplified 
products were separately purified with exonuclease and 
shrimp alkaline-phosphatase following standard proto- 
cols (Tyagi et al. 2014) and DNA sequencing was per- 
formed at an NIMR in-house facility following Sanger 
sequencing technology using an ABI 3730XL DNA 
Analyser (Applied Biosystems). For each P. falciparum 
isolate, the sequence reads of all 19 DNA fragments were 
individually edited using the SeqMan and EditSeq modules 
of the Lasergene v.7 computer program (DNASTAR, Madison, 
USA) and the sequences were manually assembled to form 
one complete mt genome. The newly generated whole mt 
genome sequences of the three isolates were submitted to GenBank 
with accessions KJ418723, KJ418724 and KJ418725. 
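Sequencing each fragment from both directions allows the two reads to be cross-checked. The sketch below is only an illustration of that idea and not the procedure used in this study, where chromatograms were inspected for double peaks: it reverse-complements a reverse read and lists the positions at which it disagrees with the forward read. The function names and the example reads are hypothetical.

```python
# Minimal sketch, assuming both reads cover exactly the same region
# and have already been trimmed to equal length (an assumption).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def discordant_positions(forward, reverse):
    """Return 1-based positions where the forward read and the
    reverse-complemented reverse read disagree."""
    rc = reverse_complement(reverse)
    if len(rc) != len(forward):
        raise ValueError("reads must cover the same region")
    return [
        i
        for i, (f, r) in enumerate(zip(forward, rc), start=1)
        if f.upper() != r.upper()
    ]


# Hypothetical reads, for illustration only
forward_read = "ATGCCT"
reverse_read = "AGGCAT"  # exact reverse complement of forward_read
print(discordant_positions(forward_read, reverse_read))
```

An empty list means the two directions agree at every position; any reported position would be a candidate for manual inspection of the chromatogram.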

To estimate mt genome diversity in Indian P. falci- 
parum and infer evolutionary patterns by comparing to 
global isolates, we have estimated several population 
genetic parameters. Specifically, the number of segre- 
gating sites, number of haplotypes, haplotype diversity 
(Hd) and a measure of nucleotide diversity (π) (Nei 1987), 
were calculated and compared with data from other 
parts of the globe using the DnaSP v.5 computer pro- 
gram (Librado & Rozas 2009). Furthermore, to perform 
comparative and evolutionary studies of global P. falci- 
parum isolates using whole mt genome sequences, five 
Indian whole mt genome sequences were analysed [of 
which 3 were newly generated in the present study and 2, 
Blspl (Tyagi et al. 2014) (GenBank accession KJ144901) 
and PfPHlO (Sharma et al. 2001) (GenBank accession 
AJ298788) were previously reported]. In these analyses, 
one consensus sequence from each continent [Africa (n 
= 29), South America (n = 28) and Asia (n = 29)] and 
country [Papua New Guinea (PNG) (n = 10)] (Joy et al. 
2003) were generated by alignment of the whole mt ge- 
nome sequences from the respective places using the 
MegAlign module of Lasergene v.7 computer program 
(DNASTAR, Madison, USA). The 100 mt genome se- 
quences of P. falciparum utilised here for comparative 
and evolutionary studies bear the GenBank accessions 
AY282924-AY283019 and AJ276844-AJ276847. The 
whole mt genome sequence of the reference 3D7 isolate 
(GenBank accession AY282930) was also included in 
this study. Therefore, 10 sequences were utilised in this 
study [5 Indian, 1 African (consensus), 1 South Ameri- 
can (consensus), 1 PNG (consensus), 1 Asian (consensus) 
and the single 3D7 strain]. All 10 sequences were aligned 
using the CLUSTALW algorithm and a phylogenetic 
tree was constructed using the neighbour-joining (NJ) 
method implemented in the MEGA v.5 computer program 
(Tamura et al. 2011). To estimate the support for 
each internal node of the NJ phylogenetic tree, 1,000 
bootstrap replicates were performed. 
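The two computational steps described above, building a regional consensus and comparing sequences, can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say which consensus rule MegAlign applied or which distance model was used for the NJ tree, so a per-column majority rule and the simple p-distance (proportion of differing sites) are assumed here. The toy alignment is hypothetical.

```python
from collections import Counter


def majority_consensus(aligned):
    """Per-column majority-rule consensus of equal-length aligned sequences.

    Ties are resolved in favour of the base seen first in the column
    (an arbitrary choice made for this sketch).
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned)
    )


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Hypothetical toy alignment standing in for one region's sequences
region = ["GCTC", "GCTC", "ACTC"]
consensus = majority_consensus(region)
print(consensus, p_distance(consensus, "ATTT"))
```

A matrix of such pairwise distances is the input that a neighbour-joining implementation would cluster; bootstrap support is then obtained by resampling alignment columns with replacement and rebuilding the tree for each replicate.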

Following the PCR diagnostic approach to detect 
malaria parasites and based on the peaks of the DNA 
sequence chromatogram, we were able to isolate three 
pure single-clone P. falciparum isolates (Betl2, Goa2 
and Mang2) and successfully sequence their complete 
mt genomes. Because complete mt genome sequences 
from two other Indian isolates (Blspl and PfPHlO) were 
available in the public domain (National Center for 
Biotechnology Information, available from: ncbi.nlm.nih.gov/), 
we also utilised the data from these two isolates 
(altogether 5 Indian mt genome sequences) to estimate 
mt genome sequence diversity and to infer first-hand 
evolutionary relationships of Indian P. falciparum with 
that of global isolates. The whole mt genome sequence 
alignment of five Indian isolates (Betl2, Goa2, Mang2, 
Blspl and PfPHlO) with the reference sequence from the 
3D7 strain revealed 26 variable nucleotide sites in Indian 
P. falciparum (Table I). Considering that the mt genome 
is fairly conserved across populations and species in 
Plasmodium (Hikosaka et al. 2011), the observed high 
number of nucleotide substitutions in Indian isolates 
was quite surprising. However, a closer look at the 
alignment revealed that as many as 21 unique nucleotide 
substitutions were present in the PfPHlO isolate (Table I). A 
very similar pattern was also observed when the PfPHlO 
isolate was compared with the Blspl and 3D7 isolates 
(Tyagi et al. 2014). Because there were such unusual 
patterns of single nucleotide mutations in the PfPHlO 
isolate, which might bias the overall outcome of the mt 
genome diversity, we did not consider the data of Pf- 
PHlO in further population genetic analyses. Therefore, 
we restricted the dataset to four mt genome sequences of 
Indian P. falciparum. Multiple alignments of mt genome 
sequences of four Indian P. falciparum isolates (Betl2, 
Goa2, Mang2 and Blspl) with the reference 3D7 isolate 
revealed five nucleotide substitutions (Table II). Out of 
these five single nucleotide polymorphisms (SNPs), four 
were found in the four Indian isolates and one in the 3D7 
isolate. Out of the four SNPs found in Indian isolates, 
two were present in the intergenic regions and two were 
in the cox I gene (Table II). The two SNPs present in 
the cox I gene, however, were found to be synonymous. 

TABLE II 

Nucleotide sequence alignment of the whole mitochondrial 
genome sequence of the four Indian Plasmodium falciparum 
isolates (excluding the previously reported PfPHIO isolate) 
with the reference sequence of the 3D7 isolate 

             Nucleotide positions 
Isolates     276    725    2175   2763   4952 
3D7 (a)      G      C      T      C      T 
Blspl (b)    .      .      .      .      C 
Betl2 (c)    A      T      .      T      C 
Goa2 (c)     .      .      C      .      C 
Mang2 (c)    .      .      C      .      C 

a: Joy et al. (2003); b: Tyagi et al. (2014); c: present paper. 
Only sites showing nucleotide variation in the alignment are 
shown; a dot indicates the same nucleotide as in 3D7. Note that 
the four single nucleotide polymorphisms in the four Indian 
isolates form three haplotypes: GCTC (Blspl), ATTT (Betl2) and 
GCCC (Goa2 and Mang2). 

Further multiple sequence alignment of the four Indian 
mt genome sequences with the 100 published sequences 
(Conway et al. 2000, Joy et al. 2003) indicated that three 
out of the four SNPs found in Indian P. falciparum (at 
positions 276, 725 and 2763) were novel and India-specific 
(data not shown). The fourth SNP (at position 2175) 
appears to be Asia-specific, as it has been detected 
only in some Asian isolates and in no samples from 
other regions (Joy et al. 2003). In comparison 
to the reference 3D7, one nucleotide substitution 
was detected in the Blspl isolate and two were found 
in each of the Goa2 and Mang2 isolates (Table II). The 
Betl2 isolate, however, contained four nucleotide 
substitutions, indicating that this isolate has diverged 
considerably from the other three Indian isolates (Table II). The 
four SNPs segregating in the four Indian isolates 
produced three haplotypes: GCTC, ATTT and GCCC 
(Table II). While the GCCC haplotype (Goa2 and Mang2) 
occurred at 50% frequency and was found in P. falciparum 
isolates from the low and mild-malaria endemic 
localities (Goa and Mangalore), the other two haplotypes 
(ATTT and GCTC) were each unique, with 
25% frequency, in mt genome sequences from the two 
highly endemic localities (Betul and Bilaspur) (Table II). 
We have further estimated the values of Hd and π in the four 
Indian P. falciparum isolates to be 0.833 and 0.00036, 
respectively. These estimates were higher than similar 
estimates in the whole mt genome sequences of P. falciparum 
isolates sampled in Asia and South America, 
but comparable to samples from Africa (Hd = 0.865; π = 
0.00025) (Joy et al. 2006). It is notable that P. falciparum 
malaria is not only highly endemic in Africa, but based 
on the high genetic diversity of isolates from this conti- 
nent, Africa is considered to be the homeland of this spe- 
cies (Conway et al. 2000, Joy et al. 2003). Although the 
present Indian data on whole mt genome sequences are 
limited to only four isolates, the high mt genome diversity 
observed was comparable to that of Africa. These 
results corroborate earlier observations on high genet- 
ic diversity in different genes in Indian P. falciparum 
(Singh et al. 2009) and therefore justify further in-depth 
analyses with a larger number of isolates from different 
eco-climatic and malaria endemic localities in India. 
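The two diversity estimates reported above can be reproduced by hand from the haplotypes listed in Table II. The sketch below uses the standard definitions (haplotype diversity as n/(n-1) times one minus the sum of squared haplotype frequencies, and nucleotide diversity as the mean number of pairwise differences per site). It is an illustration rather than a re-implementation of DnaSP; the function names are hypothetical, and the genome length of 5,967 bp is taken from the text and assumed to be the number of sites compared. Position 4952 is omitted because it does not vary among the four Indian isolates.

```python
from collections import Counter
from itertools import combinations

GENOME_LENGTH = 5967  # mt genome length in base pairs, as stated in the text

# Alleles at positions 276, 725, 2175 and 2763 (Table II)
haplotypes = {
    "Blspl": "GCTC",
    "Betl2": "ATTT",
    "Goa2": "GCCC",
    "Mang2": "GCCC",
}


def haplotype_diversity(haps):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haps)
    freqs = [count / n for count in Counter(haps).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def nucleotide_diversity(haps, length):
    """Mean number of pairwise differences, divided by sequence length."""
    pairs = list(combinations(haps, 2))
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs) / length


seqs = list(haplotypes.values())
hd = haplotype_diversity(seqs)
pi = nucleotide_diversity(seqs, GENOME_LENGTH)
print(f"Hd = {hd:.3f}, pi = {pi:.5f}")
```

With haplotype frequencies of 0.5, 0.25 and 0.25, and 13 differences summed over the six pairwise comparisons, this gives Hd = 0.833 and π = 0.00036, matching the reported values.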

With the whole mt genome sequence information of 
Indian P. falciparum, we were interested to know the 
evolutionary relationships of global isolates using phy- 
logenetic approaches. To do this, we used single consen- 
sus sequences of whole mt genomes from Africa, Asia, 
South America, PNG, the reference 3D7 and five mt 
genomes of Indian P. falciparum (Betl2, Goa2, Mang2, 
Blspl and PfPHIO). In the NJ phylogenetic tree (Fig. 2), 
the global P. falciparum isolates (Africa, Asia, South 
America, PNG, 3D7) clustered with two Indian 
sequences (Betl2 and Blspl) in a single large 
clade with high bootstrap support at the internal node 
(Fig. 2). However, two other Indian isolates (Goa2 and 
Mang2) formed a small separate clade with very 
weak bootstrap support (Fig. 2). This might be because 
both Goa2 and Mang2 isolates bear the same haplotype 
(GCCC) (Table II), which is different from those of the other two 
isolates (Betl2 and Blspl). Moreover, Goa and Manga- 
lore are from the south-western part of India and are in 
close geographic proximity to each other (Fig. 1). The 
observed genetic similarities between the two geograph- 
ically close P. falciparum populations can be explained 
by the fact that P. falciparum populations often follow 
the isolation-by-distance (IBD) model of population 
structure (Tanabe et al. 2010) and are also maintained 
as genetically sub-structured populations (Anderson et 
al. 2000). Whether Indian P. falciparum populations fol- 
low the IBD and/or genetically sub-structured models 
of population structure remains to be seen with a larger 
sample size of P. falciparum isolates collected from a 
wide geographic distribution within India. 

The distant placement of the PfPHIO isolate from 
the other P. falciparum isolates in the phylogenetic tree 
(Fig. 2) was not surprising considering the fact that this 
sequence bears a large number of mutations (21 nucle- 
otide substitutions) (Table I). This corroborates earlier 
observations on the high sequence divergence of PfPHIO 
from the Blspl and the 3D7 isolates (Tyagi et al. 2014). 
Due to the unknown origin within India of the PfPHIO 
clinical isolate (Sharma et al. 2001) we are not in a posi- 
tion to discuss the observed high mt genomic differentia- 
tion among the Indian isolates, the reference 3D7 isolate 
and global isolates of P. falciparum. The placement of 
all global isolates in a single cluster supports the hypoth- 
esis of a comparatively lower polymorphic nature of mt 
genomes in relation to other recombining genomes (e.g., 
nuclear genomes) and reasserts the suitability of 
mt genomes for evolutionary inferences (Conway et al. 
2000). In general, the four Indian P. falciparum isolates 
possess an appreciable level of nucleotide diversity (as 
measured by π) that is comparable to that of the African isolates. 
The observed results, although based on a very limited 
sample size, provide sufficient background information for 
future studies, including sequencing more Indian 
P. falciparum isolates and performing associated 
comparative and evolutionary genomic studies. Such 
studies will help to understand the population structure and 
demography of Indian P. falciparum isolates and revisit 
the evolutionary history of global P. falciparum. Most 
importantly, such studies will help determine whether the 
antimalarial drug atovaquone could be beneficial for 
malaria chemotherapy in India. This is because the cyt b gene 
present in the P. falciparum mitochondrion is considered 
to be the target of atovaquone (Vaidya et al. 1993, 
Biagini et al. 2006) and low or no diversity in the cyt b 
gene in populations would be the ideal condition for 
using atovaquone to treat P. falciparum malaria. 

Fig. 2: neighbour-joining phylogenetic tree based on whole 
mitochondrial genome sequence alignment showing genetic interrelationships 
among Plasmodium falciparum isolates from different endemic 
localities of the globe. a: Joy et al. (2003); b: Tyagi et al. (2014); c: present 
paper; d: Sharma et al. (2001). 